The same image can be described at multiple scales of abstraction, depending on whether we focus on fine-grained details or on a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to read out information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs, which have many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask the participants to design an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to perform high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating a rate of up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
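To make the INT8 constraint concrete, the following is a minimal sketch of affine (asymmetric) 8-bit quantization of a handful of pixel values. It is a generic illustration, not the scheme used by any challenge entry; the function names and the sample values are hypothetical.

```python
from typing import List, Tuple


def quant_params(lo: float, hi: float) -> Tuple[float, int]:
    """Return (scale, zero_point) mapping the real range [lo, hi] onto int8."""
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # the range must contain zero
    scale = (hi - lo) / 255.0 or 1.0      # guard against an all-zero tensor
    zero_point = round(-128 - lo / scale)
    return scale, zero_point


def quantize(values: List[float], scale: float, zero_point: int) -> List[int]:
    """Round to the nearest int8 code and clamp to [-128, 127]."""
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]


def dequantize(codes: List[int], scale: float, zero_point: int) -> List[float]:
    """Map int8 codes back to approximate real values."""
    return [(q - zero_point) * scale for q in codes]


# Hypothetical normalized pixel intensities in [0, 1].
pixels = [0.0, 0.1, 0.5, 0.9, 1.0]
scale, zero_point = quant_params(min(pixels), max(pixels))
codes = quantize(pixels, scale, zero_point)
restored = dequantize(codes, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(pixels, restored))
```

For in-range values the round-trip error is bounded by half a quantization step (`scale / 2`), which is the precision a quantized model has to tolerate at every layer.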
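Ongoing advances in network architecture design have led to remarkable achievements in deep learning across a variety of challenging computer vision tasks. Meanwhile, the development of neural architecture search (NAS) has provided promising approaches to automating the design of network architectures for lower prediction error. Recently, emerging application scenarios of deep learning have raised the demand for network architectures that satisfy multiple design criteria: the number of parameters or floating-point operations, inference latency, and others. From an optimization point of view, NAS tasks involving multiple design criteria are intrinsically multiobjective optimization problems, so it is reasonable to adopt evolutionary multiobjective optimization (EMO) algorithms to tackle them. Nonetheless, a clear gap still confines research along this pathway: on the one hand, there is no general problem formulation of NAS tasks from an optimization point of view; on the other hand, it is challenging to conduct benchmark assessments of EMO algorithms on NAS tasks. To bridge the gap: (i) we formulate NAS tasks as general multiobjective optimization problems and analyze their complex characteristics from an optimization point of view; (ii) we present an end-to-end pipeline, dubbed $\texttt{EvoXBench}$, that generates benchmark test problems on which EMO algorithms can run efficiently, without requiring GPUs or PyTorch/TensorFlow; (iii) we instantiate two test suites that comprehensively cover two datasets, seven search spaces, and three hardware devices, involving up to eight objectives. Based on the above, we validate the proposed test suites using six representative EMO algorithms and provide some empirical analyses. The code of $\texttt{EvoXBench}$ is available from $\href{https://github.com/emi-group/evoxbench}{\rm{here}}$.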
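The multiobjective view of NAS can be illustrated with a short sketch of Pareto dominance, the comparison that EMO algorithms rely on. This is a generic illustration and does not use the EvoXBench API; the candidate architectures and their objective values are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: (prediction error in %, parameters in millions, latency in ms).
candidates = [
    (6.0, 5.0, 12.0),
    (5.5, 8.0, 20.0),
    (7.0, 3.0, 9.0),
    (6.5, 6.0, 15.0),  # dominated by the first candidate
    (5.5, 9.0, 25.0),  # dominated by the second candidate
]
front = non_dominated(candidates)
```

No single candidate on the resulting front is best on all three criteria, which is why such NAS tasks yield a set of trade-off architectures rather than one optimum.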
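With the growing need for multi-robot exploration of unknown regions in challenging environments, efficient collaborative exploration strategies are required. Frontier-based Rapidly-Exploring Random Tree (RRT) exploration can be deployed to explore an unknown environment. However, its greedy behavior causes multiple robots to explore the region with the highest revenue, which leads to massive overlap in the exploration process. To address this issue, we present a temporal memory-based RRT (TM-RRT) exploration strategy that allows multiple robots to perform robust exploration in an unknown environment. It computes an adaptive duration for each assigned frontier and calculates the frontier's revenue based on the relative position of each robot. In addition, each robot is equipped with a memory of assigned frontiers that is shared across the fleet to prevent repeated assignment of the same frontier. Through both simulation and real-world deployment, we demonstrate the robustness of the TM-RRT exploration strategy by completing the exploration of a 25.0 m × 54.0 m (1350.0 m²) area, where the conventional RRT exploration strategy falls short.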
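IoT devices are increasingly implemented with neural network models to enable smart applications. Energy harvesting (EH) technology, which harvests energy from the ambient environment, is a promising alternative to batteries for powering these devices because of its low maintenance cost and the wide availability of energy sources. However, the power provided by an energy harvester is low and inherently unstable, since it varies with the ambient environment. This paper proposes EVE, an automated machine learning (AutoML) co-exploration framework that searches for desired multi-models with shared weights for energy-harvesting IoT devices. These shared models significantly reduce the memory footprint and offer different levels of model sparsity, latency, and accuracy to adapt to environmental changes. An efficient on-device implementation architecture is further developed to execute each model on the device. A run-time model extraction algorithm is proposed that retrieves an individual model with negligible overhead when a specific model mode is triggered. Experimental results show that the neural network models generated by EVE are on average 2.5 times faster than baseline models without pruning and shared weights.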
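Deep neural networks have been found to be vulnerable to adversarial attacks, which raises potential concerns in security-sensitive contexts. To address this problem, recent research has investigated the adversarial robustness of deep neural networks from an architectural point of view. However, searching for deep neural network architectures is computationally expensive, particularly when coupled with an adversarial training process. To meet this challenge, this paper proposes a bi-fidelity multiobjective neural architecture search approach. First, we formulate the NAS problem of enhancing the adversarial robustness of deep neural networks as a multiobjective optimization problem. Specifically, in addition to a low-fidelity performance predictor as the first objective, we leverage an auxiliary objective whose value is the output of a surrogate model trained with high-fidelity evaluations. Second, we reduce the computational cost by combining three performance estimation methods: parameter sharing, low-fidelity evaluation, and surrogate-based prediction. Extensive experiments on the CIFAR-10, CIFAR-100, and SVHN datasets confirm the effectiveness of the proposed approach.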
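Machine learning (ML) models need to be frequently retrained on changing datasets in a wide variety of application scenarios, including data valuation and uncertainty quantification. To retrain models efficiently, linear approximation methods such as influence functions have been proposed to estimate the impact of data changes on model parameters. However, these methods become inaccurate for large dataset changes. In this work, we focus on convex learning problems and propose a general framework that uses neural networks to learn to estimate optimized model parameters for different training sets. We propose to enforce that the predicted model parameters obey optimality conditions and maintain utility through regularization techniques, which significantly improves generalization. Moreover, we rigorously characterize the expressive power of neural networks in approximating the optimizer of convex problems. Empirical results demonstrate the advantage of the proposed method over the state of the art in accurate and efficient model parameter estimation.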
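The claim that linear approximations degrade for large dataset changes can be checked on a toy convex problem. The sketch below compares a first-order influence-function estimate with exact retraining for one-dimensional ridge regression; it illustrates the baseline being criticized, not the proposed neural framework, and the data, regularization strength, and function names are hypothetical.

```python
from typing import List, Set


def fit(xs: List[float], ys: List[float], lam: float) -> float:
    """Minimizer of sum_i 0.5*(x_i*theta - y_i)^2 + 0.5*lam*theta^2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def exact_retrain(xs: List[float], ys: List[float], lam: float, removed: Set[int]) -> float:
    """Refit from scratch on the points that remain."""
    keep = [i for i in range(len(xs)) if i not in removed]
    return fit([xs[i] for i in keep], [ys[i] for i in keep], lam)


def influence_estimate(xs: List[float], ys: List[float], lam: float, removed: Set[int]) -> float:
    """First-order estimate: one Newton-like step using the Hessian
    of the full objective instead of the reduced one."""
    theta = fit(xs, ys, lam)
    hessian = sum(x * x for x in xs) + lam
    grad_removed = sum(xs[i] * (xs[i] * theta - ys[i]) for i in removed)
    return theta + grad_removed / hessian


# Hypothetical data, roughly y = 2x with small perturbations.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [1.2, 1.9, 3.3, 3.8, 5.4, 5.9, 7.2, 7.7]
lam = 1.0

small, large = {0}, {4, 5, 6, 7}
err_small = abs(influence_estimate(xs, ys, lam, small) - exact_retrain(xs, ys, lam, small))
err_large = abs(influence_estimate(xs, ys, lam, large) - exact_retrain(xs, ys, lam, large))
```

Because the objective is quadratic, the only source of error is the stale Hessian, so the gap widens as the removed points account for more of the curvature.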
Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments that increase the speed of acquisition often result in systems with a narrower spectral bandwidth, and hence a lower axial resolution. Traditionally, image-processing-based techniques have been used to reconstruct subsampled OCT data; more recently, deep-learning-based methods have been explored. In this study, we simulate reduced axial scan (A-scan) resolution by Gaussian windowing in the spectral domain and investigate the use of a learning-based approach for image feature reconstruction. In anticipation of the reduced resolution that accompanies wide-field OCT systems, we build upon super-resolution techniques and reconstruct lost features using a pixel-to-pixel approach with an altered super-resolution generative adversarial network (SRGAN) architecture, with the aim of better supporting clinicians in their decision-making and improving patient outcomes.
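The effect of Gaussian windowing in the spectral domain can be sketched with a synthetic single-reflector signal. This is an illustrative toy model rather than the study's simulation pipeline; the sample count, reflector depth, window widths, and function names are all assumptions.

```python
import cmath
import math
from typing import List


def gaussian_window(n: int, sigma_frac: float) -> List[float]:
    """Gaussian centered on the spectrum; sigma is a fraction of its length."""
    center, sigma = (n - 1) / 2, sigma_frac * n
    return [math.exp(-0.5 * ((k - center) / sigma) ** 2) for k in range(n)]


def a_scan(spectrum: List[float]) -> List[float]:
    """Magnitude of the discrete Fourier transform, positive depths only."""
    n = len(spectrum)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * m / n) for k, s in enumerate(spectrum)))
        for m in range(n // 2)
    ]


def width_at_half_max(profile: List[float]) -> int:
    """Number of depth samples at or above half of the peak value."""
    peak = max(profile)
    return sum(1 for v in profile if v >= peak / 2)


n, depth = 128, 20  # assumed sample count and reflector depth (in bins)
fringes = [math.cos(2 * math.pi * k * depth / n) for k in range(n)]

wide = a_scan([f * w for f, w in zip(fringes, gaussian_window(n, 0.25))])
narrow = a_scan([f * w for f, w in zip(fringes, gaussian_window(n, 0.06))])

width_wide = width_at_half_max(wide)
width_narrow = width_at_half_max(narrow)
```

The narrower spectral window leaves the peak at the same depth but spreads it over more depth samples, which is the loss of axial resolution that the reconstruction network is meant to undo.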
We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJJLST20, ACJJS21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For a SCO objective constrained to the unit ball in $\mathbb{R}^d$, we obtain the following results (up to polylogarithmic factors). We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3}\epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}\epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator. For $\epsilon_{\text{opt}} \in [d^{-1}, d^{-1/4}]$, our algorithm matches the state-of-the-art oracle depth of [BJLLS19] while maintaining the optimal total work of stochastic gradient descent. We give an $(\epsilon_{\text{dp}}, \delta)$-differentially private algorithm which, given $n$ samples of Lipschitz loss functions, obtains near-optimal optimization error and makes $\min(n, n^2\epsilon_{\text{dp}}^2 d^{-1}) + \min(n^{4/3}\epsilon_{\text{dp}}^{1/3}, (nd)^{2/3}\epsilon_{\text{dp}}^{-1})$ queries to the gradients of these functions. In the regime $d \le n \epsilon_{\text{dp}}^{2}$, where privacy comes at no cost in terms of the optimal loss up to constants, our algorithm uses $n + (nd)^{2/3}\epsilon_{\text{dp}}^{-1}$ queries and improves recent advancements of [KLL21, AFKT21]. In the moderately low-dimensional setting $d \le \sqrt n \epsilon_{\text{dp}}^{3/2}$, our query complexity is near-linear.
Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density from low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model achieved an overall accuracy of 80% and the CASTC model achieved an overall accuracy of 77.9%. We found that the cashew area in Benin doubled between 2015 and 2021, with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.